Personality test


Mind Reading or Misreading? LLMs on the Big Five Personality Test

Di Cursi, Francesco, Boldrini, Chiara, Conti, Marco, Passarella, Andrea

arXiv.org Artificial Intelligence

We evaluate large language models (LLMs) for automatic personality prediction from text (APPT) under the binary Five Factor Model (BIG5). Five models -- including GPT-4 and lightweight open-source alternatives -- are tested across three heterogeneous datasets (Essays, MyPersonality, Pandora) and two prompting strategies (minimal vs. enriched with linguistic and psychological cues). Enriched prompts reduce invalid outputs and improve class balance, but also introduce a systematic bias toward predicting trait presence. Performance varies substantially: Openness and Agreeableness are relatively easier to detect, while Extraversion and Neuroticism remain challenging. Although open-source models sometimes approach GPT-4 and prior benchmarks, no configuration yields consistently reliable predictions in zero-shot binary settings. Moreover, aggregate metrics such as accuracy and macro-F1 mask significant asymmetries, with per-class recall offering clearer diagnostic value. These findings show that current out-of-the-box LLMs are not yet suitable for APPT, and that careful coordination of prompt design, trait framing, and evaluation metrics is essential for interpretable results.
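To make the metric point concrete, here is a minimal sketch (not from the paper; the labels are synthetic) of how accuracy and macro-F1 can look tolerable while per-class recall exposes a bias toward predicting trait presence:

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Synthetic binary labels: 1 = trait present, 0 = trait absent.
# The "model" over-predicts presence, as enriched prompts tend to do.
y_true = [1] * 60 + [0] * 40
y_pred = [1] * 58 + [0] * 2 + [1] * 30 + [0] * 10

print("accuracy  :", accuracy_score(y_true, y_pred))                       # 0.68
print("macro-F1  :", round(f1_score(y_true, y_pred, average="macro"), 2))  # ~0.58
print("recall(+) :", recall_score(y_true, y_pred, pos_label=1))            # ~0.97
print("recall(-) :", recall_score(y_true, y_pred, pos_label=0))            # 0.25 -- the hidden asymmetry
```

The headline numbers look mediocre but plausible; only the per-class recall reveals that the trait-absent class is barely being detected.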


Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead

Sühr, Tom, Dorner, Florian E., Salaudeen, Olawale, Kelava, Augustin, Samadi, Samira

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have achieved remarkable results on a range of standardized tests originally designed to assess human cognitive and psychological traits, such as intelligence and personality. While these results are often interpreted as strong evidence of human-like characteristics in LLMs, this paper argues that such interpretations constitute an ontological error. Human psychological and educational tests are theory-driven measurement instruments, calibrated to a specific human population. Applying these tests to non-human subjects without empirical validation risks mischaracterizing what is being measured. Furthermore, a growing trend frames AI performance on benchmarks as measurements of traits such as "intelligence", despite known issues with validity, data contamination, cultural bias, and sensitivity to superficial prompt changes. We argue that interpreting benchmark performance as a measurement of human-like traits lacks sufficient theoretical and empirical justification. This leads to our position: Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead. We call for the development of principled, AI-specific evaluation frameworks tailored to AI systems. Such frameworks might build on existing frameworks for constructing and validating psychometric tests, or could be created entirely from scratch to fit the unique context of AI.


McDonald's AI Hiring Bot Exposed Millions of Applicants' Data to Hackers Using the Password '123456'

WIRED

If you want a job at McDonald's today, there's a good chance you'll have to talk to Olivia. Olivia is not, in fact, a human being, but instead an AI chatbot that screens applicants, asks for their contact information and résumé, directs them to a personality test, and occasionally makes them "go insane" by repeatedly misunderstanding their most basic questions. Until last week, the platform that runs the Olivia chatbot, built by artificial intelligence software firm Paradox.ai, also suffered from absurdly basic security flaws. As a result, virtually any hacker could have accessed the records of every chat Olivia had ever had with McDonald's applicants--including all the personal information they shared in those conversations--with tricks as straightforward as guessing the username and password "123456." On Wednesday, security researchers Ian Carroll and Sam Curry revealed that they found simple methods to hack into the backend of the AI chatbot platform on McHire.com.


Dual Traits in Probabilistic Reasoning of Large Language Models

Li, Shenxiong, Rui, Huaxia

arXiv.org Artificial Intelligence

We conducted three experiments to investigate how large language models (LLMs) evaluate posterior probabilities. Our results reveal the coexistence of two modes of posterior judgment in state-of-the-art models: a normative mode, which adheres to Bayes' rule, and a representativeness-based mode, which relies on similarity -- paralleling human System 2 and System 1 thinking, respectively. Additionally, we observed that LLMs struggle to recall base-rate information from memory, and that prompt engineering strategies to mitigate representativeness-based judgment may be difficult to develop. We further conjecture that the dual modes of judgment may result from the contrastive loss function employed in reinforcement learning from human feedback. Our findings point to directions for reducing cognitive biases in LLMs and underscore the need for cautious deployment of LLMs in critical areas. The remarkable advancements in LLMs have ushered in a new era in which these models rival human expertise across domains such as academia, law, medicine, and finance [4, 12, 13, 22-24]. In this study, we explore how LLMs judge posterior probabilities; under the representativeness heuristic, higher similarity corresponds to a higher assessed posterior probability. The study comprises three experiments with progressively stricter conditions that reduce the information available for posterior assessment: the structured test provides all information needed for normative judgment, the semi-structured test omits the diagnosticity of the evidence, and the unstructured test requires LLMs to recall all components of Bayes' rule themselves. Results reveal that LLMs' judgments shift from the normative toward the representativeness-based mode as information is withheld. Representativeness can be constructed through typicality or prototypicality: typicality describes the common or average case of a class, whereas prototypicality embodies its most idealized and iconic version. For instance, a typical example of a physicist is a smart man who likes math and physics, while a prototypical example is Stephen Hawking. This study moves beyond bias detection to investigate the basis on which LLMs assess probabilities, with important practical implications for the integration of LLMs into various critical fields.
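As a hedged illustration of the two modes (the numbers and the scenario are hypothetical, not the paper's materials), the sketch below contrasts a normative Bayes-rule posterior with a representativeness-style judgment that matches on similarity and ignores the base rate:

```python
def bayes_posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Normative mode: P(H | E) via Bayes' rule for a binary hypothesis."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence

prior = 0.03               # base rate: P(physicist) in some population (hypothetical)
p_desc_given_h = 0.80      # P("likes math and physics" | physicist)
p_desc_given_not_h = 0.10  # P("likes math and physics" | not a physicist)

normative = bayes_posterior(prior, p_desc_given_h, p_desc_given_not_h)
representative = p_desc_given_h  # representativeness mode: similarity alone, base rate dropped

print(f"normative posterior       : {normative:.3f}")       # ~0.198
print(f"representativeness answer : {representative:.3f}")  # 0.800
```

The gap between the two outputs is exactly the base-rate neglect the paper probes: a model that cannot recall the 0.03 prior will drift toward the similarity-driven answer.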


Is GPT-4 Less Politically Biased than GPT-3.5? A Renewed Investigation of ChatGPT's Political Biases

Weber, Erik, Rutinowski, Jérôme, Jost, Niklas, Pauly, Markus

arXiv.org Artificial Intelligence

This work investigates the political biases and personality traits of ChatGPT, specifically comparing GPT-3.5 to GPT-4. In addition, the ability of the models to emulate political viewpoints (e.g., liberal or conservative positions) is analyzed. The Political Compass Test and the Big Five Personality Test were administered 100 times for each scenario, providing statistically significant results and insight into correlations among the results. The responses were analyzed by computing averages and standard deviations and by performing significance tests to investigate differences between GPT-3.5 and GPT-4. Correlations were found for traits that have been shown to be interdependent in human studies. Both models showed a progressive and libertarian political bias, with GPT-4's biases being slightly, but negligibly, less pronounced. Specifically, on the Political Compass, GPT-3.5 scored -6.59 on the economic axis and -6.07 on the social axis, whereas GPT-4 scored -5.40 and -4.73. In contrast to GPT-3.5, GPT-4 showed a remarkable capacity to emulate assigned political viewpoints, accurately reflecting the assigned quadrant (libertarian-left, libertarian-right, authoritarian-left, authoritarian-right) in all four tested instances. On the Big Five Personality Test, GPT-3.5 showed highly pronounced Openness and Agreeableness traits (O: 85.9%, A: 84.6%). Such pronounced traits correlate with libertarian views in human studies. While GPT-4 overall exhibited less pronounced Big Five personality traits, it did show a notably higher Neuroticism score. Assigned political orientations influenced Openness, Agreeableness, and Conscientiousness, again reflecting interdependencies observed in human studies. Finally, we observed that test sequencing affected ChatGPT's responses and the observed correlations, indicating a form of contextual memory.
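The reported analysis (averages, standard deviations, and significance tests over 100 administrations per scenario) can be sketched roughly as follows; the scores are simulated around the reported economic-axis means, and the 0.5 standard deviation is an assumption for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gpt35_econ = rng.normal(-6.59, 0.5, size=100)  # 100 simulated runs, economic axis
gpt4_econ = rng.normal(-5.40, 0.5, size=100)

print("GPT-3.5 mean/std:", gpt35_econ.mean().round(2), gpt35_econ.std(ddof=1).round(2))
print("GPT-4   mean/std:", gpt4_econ.mean().round(2), gpt4_econ.std(ddof=1).round(2))

# Welch's t-test (no equal-variance assumption) for the difference in means
t, p = stats.ttest_ind(gpt35_econ, gpt4_econ, equal_var=False)
print(f"t = {t:.2f}, p = {p:.2e}")
```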


Designing LLM-Agents with Personalities: A Psychometric Approach

Huang, Muhua, Zhang, Xijuan, Soto, Christopher, Evans, James

arXiv.org Artificial Intelligence

This research introduces a novel methodology for assigning quantifiable, controllable, and psychometrically validated personalities to Large Language Model-based agents (Agents) using the Big Five personality framework. It seeks to overcome the constraints of human subject studies, proposing Agents as an accessible tool for social science inquiry. Through a series of four studies, this research demonstrates the feasibility of assigning psychometrically valid personality traits to Agents, enabling them to replicate complex human-like behaviors. The first study establishes an understanding of personality constructs and personality tests within the semantic space of an LLM. Two subsequent studies -- using empirical and simulated data -- illustrate the process of creating Agents and validate the results by showing strong correspondence between human and Agent answers to personality tests. The final study further corroborates this correspondence by using Agents to replicate known human correlations between personality traits and decision-making behaviors in scenarios involving risk-taking and ethical dilemmas, thereby validating the effectiveness of the psychometric approach to designing Agents and its applicability to social and behavioral research.
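A minimal sketch of the validation step, assuming correspondence is checked by correlating human and Agent answers on matched items (the Likert responses below are hypothetical):

```python
import numpy as np
from scipy import stats

human = np.array([4, 2, 5, 3, 1, 4, 5, 2, 3, 4])  # human Likert responses (1-5)
agent = np.array([4, 3, 5, 3, 2, 4, 4, 2, 3, 5])  # matched Agent responses

r, p = stats.pearsonr(human, agent)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")  # a strong r supports correspondence
```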


Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics

Lee, Seungbeen, Lim, Seungwon, Han, Seungju, Oh, Giyeong, Chae, Hyungjoo, Chung, Jiwan, Kim, Minju, Kwak, Beong-woo, Lee, Yeonsoo, Lee, Dongha, Yeo, Jinyoung, Yu, Youngjae

arXiv.org Artificial Intelligence

The idea of personality in descriptive psychology, traditionally defined through observable behavior, has now been extended to Large Language Models (LLMs) to better understand their behavior. This raises a question: do LLMs exhibit distinct and consistent personality traits, similar to humans? Existing self-assessment personality tests, while applicable, lack the necessary validity and reliability for precise personality measurement. To address this, we introduce TRAIT, a new tool consisting of 8K multi-choice questions designed to assess the personality of LLMs with validity and reliability. TRAIT is built on psychometrically validated human questionnaires, the Big Five Inventory (BFI) and the Short Dark Triad (SD-3), enhanced with the ATOMIC10X knowledge graph for testing personality in a variety of real scenarios. TRAIT overcomes the reliability and validity issues of measuring the personality of LLMs via self-assessment, showing the highest scores across three metrics: refusal rate, prompt sensitivity, and option-order sensitivity. It reveals notable insights into the personality of LLMs: 1) LLMs exhibit distinct and consistent personalities, which are highly influenced by their training data (i.e., the data used for alignment tuning), and 2) current prompting techniques have limited effectiveness in eliciting certain traits, such as high psychopathy or low conscientiousness, suggesting the need for further research in this direction.
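Two of the reliability metrics named here, refusal rate and option-order sensitivity, might be computed along the following lines; the helper functions and sample answers are illustrative stand-ins, not the benchmark's actual code:

```python
def refusal_rate(responses, valid_options=frozenset({"A", "B", "C", "D"})):
    """Fraction of answers that are not one of the allowed choice labels."""
    return sum(1 for r in responses if r not in valid_options) / len(responses)

def option_order_sensitivity(answers_original, answers_permuted):
    """Fraction of items whose chosen option *content* changes when the
    presentation order of the options is permuted."""
    changed = sum(1 for a, b in zip(answers_original, answers_permuted) if a != b)
    return changed / len(answers_original)

print(refusal_rate(["A", "B", "I cannot answer that", "C"]))  # 0.25
# Answers are compared by content, so a reliable test-taker is unaffected
# by reordering the options:
print(option_order_sensitivity(["agree", "disagree", "agree", "agree"],
                               ["agree", "agree", "agree", "agree"]))  # 0.25
```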


LLM Agents in Interaction: Measuring Personality Consistency and Linguistic Alignment in Interacting Populations of Large Language Models

Frisch, Ivar, Giulianelli, Mario

arXiv.org Artificial Intelligence

While both agent interaction and personalisation are vibrant topics in research on large language models (LLMs), there has been limited focus on the effect of language interaction on the behaviour of persona-conditioned LLM agents. Such an endeavour is important to ensure that agents remain consistent with their assigned traits yet are able to engage in open, naturalistic dialogues. In our experiments, we condition GPT-3.5 on personality profiles through prompting and create a two-group population of LLM agents using a simple variability-inducing sampling algorithm. We then administer personality tests and submit the agents to a collaborative writing task, finding that different profiles exhibit different degrees of personality consistency and linguistic alignment to their conversational partners. Our study seeks to lay the groundwork for a better understanding of dialogue-based interaction between LLMs and highlights the need for new approaches to crafting robust, more human-like LLM personas for interactive environments.
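One simple way to operationalize linguistic alignment (the paper's actual measures may differ) is vocabulary overlap between an agent's reply and its partner's preceding turn:

```python
def lexical_alignment(prev_turn: str, reply: str) -> float:
    """Share of the reply's vocabulary that echoes the partner's last turn."""
    prev_vocab = set(prev_turn.lower().split())
    reply_vocab = set(reply.lower().split())
    return len(prev_vocab & reply_vocab) / len(reply_vocab) if reply_vocab else 0.0

print(lexical_alignment("I love hiking in the mountains",
                        "hiking in the rain is the best"))  # 0.5
```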


Revisiting the Reliability of Psychological Scales on Large Language Models

Huang, Jen-tse, Wang, Wenxuan, Lam, Man Ho, Li, Eric John, Jiao, Wenxiang, Lyu, Michael R.

arXiv.org Artificial Intelligence

Recent research has extended beyond assessing the performance of Large Language Models (LLMs) to examining their characteristics from a psychological standpoint, acknowledging the necessity of understanding their behavioral characteristics. The administration of personality tests to LLMs has emerged as a noteworthy area in this context. However, the suitability of employing psychological scales, initially devised for humans, on LLMs is a matter of ongoing debate. Our study aims to determine the reliability of applying personality assessments to LLMs, explicitly investigating whether LLMs demonstrate consistent personality traits. Analyzing responses under 2,500 settings reveals that gpt-3.5-turbo demonstrates consistent personality traits. Furthermore, our research explores the potential of gpt-3.5-turbo to emulate diverse personalities and represent various groups -- a capability increasingly sought after in the social sciences for substituting human participants with LLMs to reduce costs. Our findings reveal that LLMs have the potential to represent different personalities given specific prompt instructions. By shedding light on the personalization of LLMs, our study endeavors to pave the way for future explorations in this field.
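As an illustrative sketch (not the paper's pipeline), consistency across the 2,500 settings could be quantified as the spread of a trait score under perturbations such as prompt templates, option order, or temperature; the scores below are made up:

```python
import statistics

# One hypothetical Openness score per perturbed setting
openness_scores = [78, 81, 76, 80, 79, 77]

mean = statistics.mean(openness_scores)
std = statistics.stdev(openness_scores)
print(f"Openness: mean = {mean:.1f}, std = {std:.1f}")
# A small std across settings is evidence of a consistent trait;
# a large std would suggest the scale is unreliable for this model.
```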


Personality of AI

Yu, Byunggu, Kim, Junwhan

arXiv.org Artificial Intelligence

This research paper delves into the evolving landscape of fine-tuning large language models (LLMs) to align with human users, extending beyond basic alignment to propose "personality alignment" for language models in organizational settings. Acknowledging the impact of training methods on the formation of undefined personality traits in AI models, the study draws parallels with human fitting processes using personality tests. Through an original case study, we demonstrate the necessity of personality fine-tuning for AIs and raise intriguing questions about applying human-designed tests to AIs, engineering specialized AI personality tests, and shaping AI personalities to suit organizational roles. The paper serves as a starting point for discussions and developments in the burgeoning field of AI personality alignment, offering a foundational anchor for future exploration in human-machine teaming and co-existence.